Estimating the Maximum Expected Value through Gaussian Approximation
Authors
Abstract
This paper addresses the estimation of the maximum expected value of a set of independent random variables. The performance of several learning algorithms (e.g., Q-learning) is affected by the accuracy of this estimation. Unfortunately, no unbiased estimator exists. The usual approach of taking the maximum of the sample means leads to large overestimates that may significantly harm the performance of the learning algorithm. Recent works have shown that the cross-validation estimator, which is negatively biased, outperforms the maximum estimator in many sequential decision-making scenarios. On the other hand, the relative performance of the two estimators is highly problem-dependent. In this paper, we propose a new estimator for the maximum expected value, based on a weighted average of the sample means, where the weights are computed using Gaussian approximations for the distributions of the sample means. We compare the proposed estimator with the other state-of-the-art methods both theoretically, by deriving upper bounds on the bias and the variance of the estimator, and empirically, by testing the performance on different sequential learning problems.
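The contrast between the maximum estimator and a weighted average of sample means can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names, the Monte Carlo approximation of the weights, and the parameters `n_draws` and `seed` are assumptions introduced here. Each sample mean is approximated by a Gaussian centred on the sample mean with the standard error as its scale, and each weight approximates the probability that the corresponding variable has the largest mean.

```python
import numpy as np


def maximum_estimator(samples):
    """Maximum of the sample means (tends to overestimate)."""
    return max(float(np.mean(s)) for s in samples)


def weighted_estimator(samples, n_draws=100_000, seed=0):
    """Weighted average of sample means (hypothetical sketch).

    Weights approximate P(variable i has the largest mean) under a
    Gaussian approximation of each sample mean. They are estimated
    here by Monte Carlo sampling, which is an assumption of this
    sketch rather than the method stated in the abstract.
    """
    rng = np.random.default_rng(seed)
    means = np.array([np.mean(s) for s in samples])
    # Standard error of each sample mean (needs >= 2 samples each).
    sems = np.array([np.std(s, ddof=1) / np.sqrt(len(s)) for s in samples])
    # Draw plausible "true means" from the Gaussian approximations.
    draws = rng.normal(means, sems, size=(n_draws, len(samples)))
    # Count how often each variable comes out as the largest.
    winners = np.argmax(draws, axis=1)
    weights = np.bincount(winners, minlength=len(samples)) / n_draws
    return float(weights @ means), weights


# Usage: five variables that all have true mean 0, so the true
# maximum expected value is 0.
data_rng = np.random.default_rng(42)
data = [data_rng.normal(0.0, 1.0, size=20) for _ in range(5)]
me = maximum_estimator(data)
we, w = weighted_estimator(data)
print("maximum estimator:", me)
print("weighted estimator:", we)
```

Because the weights are non-negative and sum to one, the weighted estimate can never exceed the maximum of the sample means, which is why it is less prone to overestimation.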
Similar resources
Estimating Maximum Expected Value through Gaussian Approximation
Theorem 2. If we compare the expected value of DE reported in Equation (4) with the value of the estimator WE in Equation (3), we can notice strong similarities. The main difference is that in DE the sample mean of variable Xi and its probability of being the maximum are computed w.r.t. two independent sets of samples, while in WE these two quantities are positively correlated. It follows that W...
Approximations to the Loglikelihood Function in the Nonlinear Mixed Effects Model
Nonlinear mixed effects models have received a great deal of attention in the statistical literature in recent years because of the flexibility they offer in handling unbalanced repeated measures data that arise in different areas of investigation, such as pharmacokinetics and economics. Several different methods for estimating the parameters in nonlinear mixed effects models have been proposed....
Width invariant approximation of fuzzy numbers
In this paper, we consider the width invariant trapezoidal and triangular approximations of fuzzy numbers. The presented methods avoid the effortful computation of the Karush-Kuhn-Tucker Theorem. Some properties of the new approximation methods are presented and the applicability of the methods is illustrated by examples. In addition, we show that the proposed approximations of fuzzy numbers preserv...
Expected Duration of Dynamic Markov PERT Networks
Abstract: In this paper, we apply stochastic dynamic programming to approximate the mean project completion time in dynamic Markov PERT networks. It is assumed that the activity durations are independent random variables with exponential distributions, but some social and economic problems influence the mean of activity durations. It is also assumed that the social problems evolve in ac...
Density Estimation Through Convex Combinations of Densities; Approximation and Estimation
We consider the problem of estimating a density function from a sequence of independent and identically distributed observations x_i taking values in R^d. The estimation procedure constructs a convex mixture of 'basis' densities and estimates the parameters using the maximum likelihood method. Viewing the error as a combination of two terms, the approximation error measuring the adequacy of the m...
Journal title:
Volume, issue:
Pages: -
Publication date: 2016